Chat Mining: Predicting User and Message Attributes in Computer-Mediated Communication
Tayfun Kucukyilmaz
Ph.D. Student
Computer Engineering Department
Bilkent University
This paper investigates whether several user and message attributes can be predicted in text-based, real-time online messaging services. For this purpose, a large collection of chat messages is examined, and the applicability of various supervised classification techniques for extracting information from the chat messages is evaluated. Two competing models are used to define the chat mining problem: a term-based approach investigates user and message attributes in the context of vocabulary use, while a style-based approach examines the chat messages according to variations in their writing style. The results are strong; for example, when 100 users are considered, the identity and gender of chat message authors are predicted with 99.7% and 82.2% accuracy, respectively. Moreover, the reverse problem is explored, and the effect of personal attributes on text-based communication is discussed.
DATE:
Monday, February 12, 2007, 16:40
PLACE:
EA 409